Integrating Cyc and Wikipedia: Folksonomy Meets Rigorously Defined Common-Sense
نویسندگان
چکیده
Integration of ontologies begins with establishing mappings between their concept entries. We map categories from the largest manually-built ontology, Cyc, onto Wikipedia articles describing corresponding concepts. Our method draws both on Wikipedia’s rich but chaotic hyperlink structure and Cyc’s carefully defined taxonomic and common-sense knowledge. On 9,333 manual alignments by one person, we achieve an F-measure of 90%; on 100 alignments by six human subjects the average agreement of the method with the subject is close to their agreement with each other. We cover 62.8% of Cyc categories relating to common-sense knowledge and discuss what further information might be added to Cyc given this substantial new alignment.
منابع مشابه
Wikipedia-based Semantic Interpretation for Natural Language Processing
Adequate representation of natural language semantics requires access to vast amounts of common sense and domain-specific world knowledge. Prior work in the field was based on purely statistical techniques that did not make use of background knowledge, on limited lexicographic knowledge bases such as WordNet, or on huge manual efforts such as the CYC project. Here we propose a novel method, cal...
متن کاملWord Sense Disambiguation Using Wikipedia
This paper describes explorations in word sense disambiguation using Wikipedia as a source of sense annotations. Through experiments on four different languages, we show that the Wikipedia-based sense annotations are reliable and can be used to construct accurate sense classifiers.
متن کاملComputing semantic relatedness of words and texts in Wikipedia-derived semantic space
Adequate representation of natural language semantics requires access to vast amounts of common sense and domain-specific world knowledge. Prior work in the field was either based on purely statistical techniques that did not make use of background knowledge or on huge manual efforts, such as the CYC projects. Here we propose a novel method, called Explicit Semantic Analysis (ESA), for finegrai...
متن کاملAssociating Semantics to Multilingual Tags in Folksonomies
Tagging systems are nowadays a common feature in web sites where user-generated content plays an important role. However, the lack of semantics and multilinguality hamper information retrieval process based on folksonomies. In this paper we propose an approach to bring semantics to multilingual folksonomies. This approach includes a sense disambiguation activity and takes advantage from knowled...
متن کاملThe importance of cross-lingual information for matching Wikipedia with the Cyc ontology
In this paper we try to answer the question how cross-lingual evidence may improve matching between di erent classi cation schemas. We concentrate speci cally on the task of mapping between Wikipedia categories and Cyc terms as well as the classi cation of Wikipedia articles to the Cyc taxonomy and show how this process may be improved by consuming the evidence that is available in di erent edi...
متن کامل